


Practical Neural Networks (3)


Part 3 — Feedback Nets and Competitive Nets 


By Chris MacLeod and Grant Maxwell 





This month we look at two more advanced neural nets: the Hopfield network, which uses feedback in its structure, and the Competitive net, which can recognise patterns in data even if the programmer doesn't know they exist.

Figure 1. A Hopfield Net.


In 1982 a physicist named John Hopfield
published a famous paper on neural nets. 
This paper helped to re-ignite the field, 
which had been languishing in the doldrums 
for some time. 


Actually, the ANN which bears his name 
— the Hopfield Network — is of somewhat 
limited practical use, but it does help us to 
understand the ins and outs of neural net 
behaviour. 

What Hopfield did was to create a network with feedback, that is, with connections from the outputs back towards the inputs. Figure 1 shows the idea. The network always has a single layer and the same number of inputs as neurons.


Forward pass operation


The neurons operate in the same way as the binary ones described in part 1, except that they produce outputs of -1 and +1 rather than 0 and 1. The only difference in operation is that the output of the network is fed back to the input once it has been calculated, and so goes through the network again. Eventually, if the network has been trained properly, the output will become constant (the inputs and outputs will be the same). The network is then said to have relaxed, and is ready for you to read its outputs. Figure 2 shows this process in the form of a flow chart.

Figure 2. Running data through a Hopfield net.
Figure 3. Operation of a Hopfield net.
Figure 4. Applying images.
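The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under some assumptions of our own: the weights are made-up values chosen so that the pattern 1, -1, 1, -1 is stable, all neurons are updated together on each pass, a net input of exactly zero is treated as +1, and the names `forward_pass`, `relax` and `max_passes` are hypothetical.

```python
def sign(value):
    """Binary neuron with -1/+1 outputs; zero is treated as +1 (an assumption)."""
    return 1 if value >= 0 else -1


def forward_pass(weights, inputs):
    """One pass through the single layer; weights[m][n] joins input m to neuron n."""
    size = len(inputs)
    return [sign(sum(inputs[m] * weights[m][n] for m in range(size)))
            for n in range(size)]


def relax(weights, inputs, max_passes=20):
    """Feed the outputs back to the inputs until they stop changing."""
    state = list(inputs)
    for _ in range(max_passes):
        outputs = forward_pass(weights, state)
        if outputs == state:      # inputs and outputs are the same: relaxed
            return outputs
        state = outputs           # send the outputs round again
    return state                  # gave up: the net did not settle in time


# Illustrative weights (not from the article) that make 1, -1, 1, -1 stable.
example_weights = [
    [0, -1, 1, -1],
    [-1, 0, -1, 1],
    [1, -1, 0, -1],
    [-1, 1, -1, 0],
]
corrupted = [1, 1, 1, -1]         # the second element has been flipped
restored = relax(example_weights, corrupted)
print(restored)                   # [1, -1, 1, -1]
```

The `max_passes` limit is there because a net that has not been trained properly may never settle, in which case the loop would otherwise run for ever.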


Uses 


Before going any further, it's worth pausing to consider what the Hopfield network can do that the BP network cannot. A Hopfield network, rather than just recognising an image, can store and retrieve patterns: it has a memory. We can input a corrupted image into the network and it will reproduce the perfect stored version. Figure 3 shows the idea.

Figure 5. Worked example of Hopfield training.

Once the network is trained properly, all 
we have to do is present it with the corrupted 
version as its inputs and then wait until the 
network stops cycling as described above. 
Once this has happened, we can read the 
outputs of the network and they will give us a 
reconstructed image, see Figure 4. 

In the original Hopfield net, all inputs and outputs are either -1, which could represent, say, a white pixel, or +1 for a black pixel. Networks with continuous outputs are more common today, but for our discussion we'll stick with the simple case.


Training 


Now that we know what the Hopfield net- 
work does, let us turn our attention to how it 
can be trained. 

Compared with the Back Propagation net- 
work, training the Hopfield is easy. All the 
weights are calculated using a simple for- 
mula: 


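$$W_{m,n} = \sum_{p} O_{m(p)} \times O_{n(p)} \quad \text{(summed over all patterns } p\text{)}$$

Make weights $W_{n,n} = 0$.


Where $W_{m,n}$ is the weight of the connection between the $m$th input and the $n$th neuron, and $O_{n(p)}$ is the $n$th output desired from the network for pattern $p$.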

In other words, to find the weight of the connection between input m and neuron n, take each pattern to be trained in turn, multiply the mth output by the nth output, and add the results together. As usual this is best illustrated by example, see Figure 5.

Let's say we'd like to train three patterns:


Pattern number one:

$O_{A(1)} = -1 \quad O_{B(1)} = -1 \quad O_{C(1)} = 1$

Pattern number two:

$O_{A(2)} = 1 \quad O_{B(2)} = -1 \quad O_{C(2)} = -1$

Pattern number three:

$O_{A(3)} = -1 \quad O_{B(3)} = 1 \quad O_{C(3)} = 1$


$W_{1,1} = 0$

$W_{1,2} = O_{A(1)} \times O_{B(1)} + O_{A(2)} \times O_{B(2)} + O_{A(3)} \times O_{B(3)} = (-1) \times (-1) + 1 \times (-1) + (-1) \times 1 = -1$

$W_{1,3} = O_{A(1)} \times O_{C(1)} + O_{A(2)} \times O_{C(2)} + O_{A(3)} \times O_{C(3)} = (-1) \times 1 + 1 \times (-1) + (-1) \times 1 = -3$

$W_{2,2} = 0$

$W_{2,1} = O_{B(1)} \times O_{A(1)} + O_{B(2)} \times O_{A(2)} + O_{B(3)} \times O_{A(3)} = (-1) \times (-1) + (-1) \times 1 + 1 \times (-1) = -1$

$W_{2,3} = O_{B(1)} \times O_{C(1)} + O_{B(2)} \times O_{C(2)} + O_{B(3)} \times O_{C(3)} = (-1) \times 1 + (-1) \times (-1) + 1 \times 1 = 1$

$W_{3,3} = 0$

$W_{3,1} = O_{C(1)} \times O_{A(1)} + O_{C(2)} \times O_{A(2)} + O_{C(3)} \times O_{A(3)} = 1 \times (-1) + (-1) \times 1 + 1 \times (-1) = -3$

$W_{3,2} = O_{C(1)} \times O_{B(1)} + O_{C(2)} \times O_{B(2)} + O_{C(3)} \times O_{B(3)} = 1 \times (-1) + (-1) \times (-1) + 1 \times 1 = 1$

Figure 6. A general neural net.
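The arithmetic of the worked example can be checked with a short Python sketch. It assumes the three patterns read as (-1, -1, 1), (1, -1, -1) and (-1, 1, 1) for outputs A, B and C, and the function name `train_hopfield` is our own; it simply applies the sum-of-products rule and zeroes the weights where m equals n.

```python
def train_hopfield(patterns):
    """Return weights[m][n] = sum over patterns of O_m * O_n, with zeros where m == n."""
    size = len(patterns[0])
    weights = [[0] * size for _ in range(size)]
    for m in range(size):
        for n in range(size):
            if m != n:            # weights W(n,n) are left at zero
                weights[m][n] = sum(p[m] * p[n] for p in patterns)
    return weights


# The three patterns of the worked example: outputs A, B and C.
patterns = [
    [-1, -1, 1],
    [1, -1, -1],
    [-1, 1, 1],
]
weights = train_hopfield(patterns)
for row in weights:
    print(row)
# [0, -1, -3]
# [-1, 0, 1]
# [-3, 1, 0]
```

Row 1 of the result holds W1,1 to W1,3, row 2 holds W2,1 to W2,3 and so on. Notice that the table is symmetrical: W1,2 is the same as W2,1.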


Unlike BP training, the calculations are done only once and are not repeated. We can write a simple algorithm to set the weights for a Hopfield as shown in Listing 1, where the same variables are used as shown in the forward pass example in part 1 (where the weights are held in a two-dimensional array). The desired outputs are held in an array i(pattern_no, pixel_number).


Listing 1

FOR f = 1 TO no_of_inputs
  FOR t = no_of_inputs + 1 TO no_of_inputs + no_of_outputs
    FOR p = 1 TO no_of_patterns
      w(f, t) = w(f, t) + i(p, f) * i(p, t - no_of_inputs)
    NEXT p
    IF t = no_of_inputs + f THEN w(f, t) = 0
  NEXT t
NEXT f


Capabilities


So the Hopfield net has a memory. But what 
else can it do? Actually, its practical applica- 
tions are a little limited, but it tells us a lot 
about the general capabilities of neural nets. 

In part 1, we discussed the similarity of 
the feed forward network to combinational 
logic. But the ANN is logic which can produce 
any truth table by learning, rather than 
detailed design. Similarly, the analogy for the 
Hopfield is sequential logic. After all, a flip-flop such as a JK or SR is a simple memory, and this is also achieved through the use of feedback.

In fact, the Hopfield can produce time-series, oscillatory or even chaotic outputs if you let it, although the training illustrated above is designed always to produce a stable network: the outputs always decay to a steady state.

The simple Hopfield net illustrated here has limitations. It is prone to local minima (which in this case means that it may have difficulty reconstructing some patterns), so more sophisticated training algorithms have been developed. For example, there are variants of the Back Propagation algorithm which can be used to train Hopfield nets using targets (like the BP networks in part 2), but these don't guarantee stability.
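To get a feel for the kind of difficulty mentioned above, we can run the three trained patterns back through a net using the weights from the worked example of Figure 5. This is only a sketch: it assumes that all neurons are updated together on each pass and that a net input of zero gives +1, neither of which is specified in the article, and the function name `relax` is our own.

```python
# Weights as calculated in the worked example (Figure 5).
weights = [
    [0, -1, -3],
    [-1, 0, 1],
    [-3, 1, 0],
]


def relax(weights, inputs, max_passes=10):
    """Feed the outputs back to the inputs until they stop changing."""
    state = list(inputs)
    for _ in range(max_passes):
        outputs = []
        for n in range(len(state)):
            net = sum(state[m] * weights[m][n] for m in range(len(state)))
            outputs.append(1 if net >= 0 else -1)   # zero treated as +1 (assumption)
        if outputs == state:
            break
        state = outputs
    return state


trained = [[-1, -1, 1], [1, -1, -1], [-1, 1, 1]]
results = [relax(weights, pattern) for pattern in trained]
for pattern, result in zip(trained, results):
    print(pattern, "->", result)
# [-1, -1, 1] -> [-1, 1, 1]
# [1, -1, -1] -> [1, -1, -1]
# [-1, 1, 1] -> [-1, 1, 1]
```

In this sketch the second and third patterns come back unchanged, but the first relaxes into the third, so a three-neuron net trained in this simple way does not hold all three patterns.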

We can extend the capabilities of the simple Hopfield if we add an extra layer. Such networks are known as Bi-directional Associative Memories (BAMs) and can associate an input with a different memory. But beyond this, the structure of the Hopfield net is too rigid; we need to use its lessons to devise more general nets.

Figure 7. A simple competitive net.
Figure 8. A winning neuron.


General Neural Nets


We've seen how the Hopfield net is 
more general than the simple feed- 
forward type. In fact the feedback 
type just degenerates to a feedfor- 
ward net if the feedback paths are 
set to zero. 

You might guess therefore, that 
the most general neural nets would 
have a mixture of both feedback and 
feedforward connections. In fact this 
is true. In the most general network, any neuron could be connected to any other; Figure 6 shows the idea.

Training such networks is tricky, 
as algorithms like the Hopfield Train- 
ing illustrated above and even Back 
Propagation only operate when the 
network has a defined and limited 
structure. To train a network where 
any neuron may be connected to any 
other demands more advanced algo- 
rithms. 

Perhaps the easiest algorithm to 
employ and certainly the most com- 
mon in such circumstances is the 
Genetic Algorithm. One can employ 
the algorithm to choose the weights 
in a general network in the same 
way as one can use it to choose component values in the examples given in that article, with the fitness of the network being the inverse of its error.
The details of such advanced train- 
ing methods can wait for a future 
article. 


Competitive Learning 


Now, let’s look at a quite different 
network. You'll remember that in 
part 2, we mentioned that probably 
80% of neural nets used in practice 
were Feedforward, Back Propagation 
nets. This leaves the question of the 
remaining 20%. Well, most of these 
are of a type of network known as a 
Competitive or Winner-Takes-All 
Net. 


Operation 


The Competitive net is best illus- 
trated by example. Suppose we have 
a network of three neurons as shown 
in Figure 7. 

The neurons work in exactly the same way as those already described in part 1, except that we don't need to apply a threshold or sigmoid function.

We won't worry too much at this stage about the set-up of the weights, except to say that they are essentially random.

Now let us apply a pattern to the 
network. Just by chance (since the 
weights are random), one of these 
neurons will have a higher output 
than the others — let’s say it’s neu- 
ron three, as shown in Figure 8. 

We say that this neuron has won 
and set its output to 1 and the oth- 
ers to zero. 

Now we train only the weights of 
neuron 3 (the ones shown in bold), 
so that, if this pattern comes along 
again it will have an even higher out- 
put — it will win even more easily. 
So neuron three will always fire 
when this pattern comes along. In 
other words, neuron three recognises 
the pattern. This is very simple to 
do; we just update the weights with 
this formula: 


$$W^{+} = W + \eta(\text{Input} - W)$$


Where W+ is the new (trained) weight, W is the original weight, Input is the input feeding that weight and η is a small constant, much less than 1 (say 0.1).

Of course, if another, completely different, pattern comes along, a different neuron will win and this new neuron will then get trained for that pattern. In this way the network self-organises so that each neuron fires for its own pattern.
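One training step of such a net might be sketched in Python as follows. The starting weights and the input pattern are made-up numbers for illustration, and the function names `neuron_outputs` and `train_step` are our own; the update itself is the formula given above with η = 0.1.

```python
def neuron_outputs(weights, pattern):
    """Weighted sum for each neuron; no threshold or sigmoid is applied."""
    return [sum(i * w for i, w in zip(pattern, neuron)) for neuron in weights]


def train_step(weights, pattern, eta=0.1):
    """Find the winning neuron and train only its weights; returns its index."""
    outputs = neuron_outputs(weights, pattern)
    winner = outputs.index(max(outputs))
    weights[winner] = [w + eta * (i - w)
                       for i, w in zip(pattern, weights[winner])]
    return winner


# Hypothetical starting weights: one row of two weights per neuron.
weights = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
pattern = [0.1, 0.9]

before = neuron_outputs(weights, pattern)
winner = train_step(weights, pattern)
after = neuron_outputs(weights, pattern)

print("winner is neuron", winner + 1)      # winner is neuron 3
print(before[winner], "->", after[winner])
```

With these numbers neuron three wins, and after training its output for the same pattern is slightly higher than before, while the weights of the other two neurons are left untouched.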


Uses 


Suppose we let a competitive net- 
work loose on some data — let’s say 
from the financial markets. The net- 
work would organise itself to find 
patterns in the data. Exactly what 
these patterns are, we don’t know, 
the network decides for itself — we 
don’t give it examples like a Back 
Propagation network. This is both 
the attraction and the disadvantage 
of the Competitive network — you 
might find an important pattern in 
the data which you didn’t know 
existed — but it could miss what 
you're looking for and find some 
other unwanted patterns to recog- 
nise. 

In the same way and related to this, the network will fire the same neuron for patterns it finds similar (even though the similarity may not be obvious to the user). We can say therefore that, whereas the Back Propagation network is trained by the user to recognise patterns, the Competitive net trains itself to classify patterns.

Figure 9. The inputs shown on a graph.
Figure 10. The weight vector of neuron 3.

Of course you could use the competitive 
neuron to recognise patterns like Back Prop- 
agation. But this seems rather a waste of 
effort since BP works extremely well and is 
generally easier to set up than a competitive 
net. 


More detail 


To understand some of the subtle features of the competitive system, we need to examine its operation in a little more detail. To do this, let's return to the network shown in Figures 7 and 8.

The network has two inputs and it’s pos- 
sible to represent these as a line (called a vec- 
tor) on a graph, where y is the value of input 
1 and x input 2. This is shown in Figure 9 (of 
course this applies to any number of inputs, 
but two are easy to visualise). 

The length of this vector, by Pythagoras, is:

$$\text{Length} = \sqrt{(\text{input1})^2 + (\text{input2})^2}$$

We can also plot a line representing the weights of neuron 3 on the same graph by making its two weights the x and y coordinates, as shown in Figure 10.

Now, when we work out the output of the neuron ($i_1 w_1 + i_2 w_2$), what we are actually doing is calculating what's known as the dot product, which can be considered a measure of how similar the two vectors are. If both the vectors were exactly the same (one lying on top of the other) the output would be larger than if they were different.

If all the vectors were the same length, 
then we'd just be measuring the angle 
between them (which would make matters 
easier, as it means that we don't have to take 
length into consideration), so that is what we 
do. We can make all the vectors one unit long 
by dividing them by their length. 
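As a rough illustration of normalising vectors and comparing them with the dot product, here is a small Python sketch. The vectors are made-up numbers and the function names are our own.

```python
import math


def length(vector):
    """Length of a vector by Pythagoras."""
    return math.sqrt(sum(v * v for v in vector))


def normalise(vector):
    """Divide a vector by its length to make it one unit long.

    Assumes the vector is not all zeros, otherwise we would divide by zero.
    """
    size = length(vector)
    return [v / size for v in vector]


def dot(a, b):
    """Dot product, e.g. i1*w1 + i2*w2 for two inputs and two weights."""
    return sum(x * y for x, y in zip(a, b))


input_vector = normalise([3.0, 4.0])    # becomes 0.6, 0.8
same_direction = normalise([6.0, 8.0])  # longer, but pointing the same way
other_direction = normalise([4.0, 3.0])

print(dot(input_vector, same_direction))   # close to 1.0
print(dot(input_vector, other_direction))  # close to 0.96
```

Once both vectors are one unit long, the dot product depends only on the angle between them: it is at its largest (one) when the vectors lie on top of each other and gets smaller as they move apart.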

Now, consider the weight vectors for all 
three neurons in the network, Figure 11. 
These have all been normalised to one unit as 
described above. 

Neuron 3 has won because it is closest to 
the input and therefore has the largest dot 
product (it is most similar to the input). What 
the training does, is move the weight vector 
of neuron 3 even closer to the input, as 
shown in Figure 12 (remember that only the 
weights of neuron 3 are trained). 

This, of course, makes it likely that, if a 
similar pattern comes around again, neuron 
3 will fire. 

The training formula $W^{+} = W + \eta(\text{Input} - W)$ doesn't preserve the unit length of the weight vector it operates upon, so after using it you should divide the weight vector of the winning neuron by its length to make it one unit long again.
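The effect can be seen in a short Python sketch. The weight and input vectors are made-up unit vectors, η is taken as 0.1, and the function names are our own.

```python
import math


def length(vector):
    """Length of a vector by Pythagoras."""
    return math.sqrt(sum(v * v for v in vector))


def train_winner(weights, inputs, eta=0.1):
    """Apply W+ = W + eta * (Input - W), then restore unit length.

    Returns both the raw result and the renormalised one for comparison.
    """
    moved = [w + eta * (i - w) for w, i in zip(weights, inputs)]
    size = length(moved)          # assumed to be non-zero
    renormalised = [w / size for w in moved]
    return moved, renormalised


weights = [1.0, 0.0]              # unit-length weight vector of the winner
inputs = [0.0, 1.0]               # unit-length input vector
moved, renormalised = train_winner(weights, inputs)

print(length(moved))              # less than 1: training has shortened the vector
print(length(renormalised))       # back to one unit (within rounding)
```

Here the trained vector comes out at roughly 0.91 units long, so without the final division the neuron would be at a disadvantage the next time the dot products are compared.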

You can probably see that the distribution 
of the weights in this type of network is quite 
critical and so it helps to consider the distri- 
bution of vectors around the origin when set- 
ting up the network to ensure an even cover- 
age. 


Networks based on Competitive Neurons


Competitive neurons are seldom used just on their own, but form the mainstay of several more complex networks. They are often laid out in the form of a 2D grid as shown in Figure 13. This is known as a Kohonen Self Organising Map.

What happens in this case is that the winning neuron (shown in black) is fully trained and the surrounding neurons (shown in grey) are partially trained (by making η in the formula a smaller number).
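A single training step of such a grid might look something like the Python sketch below. Several details here are assumptions of our own rather than anything laid down in the article: the neighbourhood is taken to be the (up to) eight neurons touching the winner, the neighbours use half the winner's η, the renormalising of the weight vectors is left out for brevity, and the numbers are purely illustrative.

```python
def som_step(grid, pattern, eta=0.1, neighbour_eta=0.05):
    """Train the winner fully and its adjacent neurons partially.

    grid[r][c] is the weight vector of the neuron at row r, column c.
    Returns the (row, column) position of the winning neuron.
    """
    positions = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    # The winner is the neuron with the largest output (dot product).
    win_r, win_c = max(
        positions,
        key=lambda rc: sum(i * w for i, w in zip(pattern, grid[rc[0]][rc[1]])),
    )
    for r, c in positions:
        distance = max(abs(r - win_r), abs(c - win_c))
        if distance == 0:
            rate = eta                # the winning neuron
        elif distance == 1:
            rate = neighbour_eta      # neurons touching the winner
        else:
            continue                  # everything else is left alone
        grid[r][c] = [w + rate * (i - w) for i, w in zip(pattern, grid[r][c])]
    return win_r, win_c


# A 4 x 4 grid of two-input neurons, with made-up starting weights.
grid = [[[0.0, 0.0] for _ in range(4)] for _ in range(4)]
grid[1][1] = [0.5, 0.0]
winner = som_step(grid, [1.0, 0.0])

print(winner)        # (1, 1)
print(grid[1][1])    # winner moved furthest towards the input
print(grid[0][0])    # neighbour moved a little
print(grid[3][3])    # too far away: unchanged
```

Practical maps usually shrink both η and the size of the neighbourhood as training goes on, but that refinement is beyond this sketch.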

Figure 11. Weight vectors for all three neurons.
Figure 12. Effect of training.
Figure 13. A Self Organising Map.

When the network has been allowed to train in this way, the result is that it forms a map in which the most similar patterns are grouped together and are far away from the less similar ones.
Another very advanced network 
based on Competitive neurons is 
Adaptive Resonance Theory (ART). 
This network can change its size 
and grow as it learns more patterns. 
In the final part of the series we'll 
have a look at some of the other 
applications of neural nets and some 
of the more advanced topics which 
researchers are wrestling with. 
(020324-3)